Non-Rigid Structure from Motion
This thesis revisits a challenging classical problem in geometric computer vision known as Non-Rigid Structure-from-Motion (NRSfM), in which the task is to recover the 3D shape and motion of a non-rigidly moving object from image data. A reliable solution to this problem is valuable in several industrial applications such as virtual reality, medical surgery, and animated movies. Nevertheless, to date, no algorithm can solve NRSfM for all kinds of conceivable motion, and additional constraints and assumptions are therefore often employed. The task is challenging due to the inherently unconstrained nature of the problem: many varying 3D configurations can have similar image projections. The problem becomes even more challenging if the camera moves along with the object.
The thesis takes a modern view of this challenging problem and proposes a few algorithms that set a new performance benchmark for NRSfM. It not only discusses the classical work in NRSfM but also proposes some powerful elementary modifications to it. The foundation of this thesis surpasses traditional single-object NRSfM and, for the first time, provides an effective formulation to realise multi-body NRSfM.
Most factorisation-based techniques for NRSfM can only handle sparse feature correspondences. These sparse features are then used to construct a scene from points, lines, planes, or other elementary geometric primitives. However, a sparse representation provides only incomplete information about the scene. This thesis moves from sparse NRSfM to dense NRSfM for a single object, and then gradually lifts the intuition to realise dense 3D reconstruction of an entire dynamic scene, posed as a global as-rigid-as-possible deformation problem.
The core of this work goes beyond the traditional approach to dealing with deformation. It shows that relative scales for multiple deforming objects can be recovered under some mild assumptions about the scene. The work proposes a new approach for dense, detailed 3D reconstruction of a complex dynamic scene from two perspective frames. Since the method needs no depth information and assumes neither a template prior, nor per-object segmentation, nor knowledge about the rigidity of the dynamic scene, it is applicable to a wide range of scenarios, including YouTube videos.
Lastly, this thesis provides a new way to perceive the depth of a dynamic scene, one that removes motion estimation as a compulsory step in solving the problem. Conventional geometric methods for depth estimation require a reliable estimate of the motion parameters of each moving object, which is difficult to obtain and validate. In contrast, this thesis introduces a motion-free approach to estimate the dense depth map of a complex dynamic scene over successive frames. The work shows that, given per-pixel optical flow correspondences between two consecutive frames and a sparse depth prior for the reference frame, the dense depth map of the successive frame can be recovered without solving for motion parameters. By assigning a locally rigid structure to a piecewise planar approximation of the dynamic scene that transforms as rigidly as possible over frames, the motion estimation step can be bypassed. Experimental results and MATLAB code on relevant examples are provided to validate the motion-free idea.
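To make the motion-free idea concrete, the following minimal Python sketch (an illustrative assumption, not the thesis' actual MATLAB implementation) uses a regular grid as a stand-in for the piecewise planar segments: sparse depths are lifted to a dense reference-frame depth map by per-cell plane fitting in inverse depth, and the per-pixel optical flow then carries that depth to the successive frame. All function names and parameters are hypothetical.

import numpy as np

# Hypothetical sketch: propagate a sparse depth prior to a dense depth map of
# the next frame using dense optical flow and a piecewise-planar assumption.
# Grid cells stand in for the planar segments; the warp step is a crude
# stand-in for the as-rigid-as-possible transfer described in the text.

def dense_depth_from_sparse(sparse_uv, sparse_d, shape, cell=32):
    """Fit a plane in inverse depth per grid cell from sparse (u, v, depth) samples."""
    H, W = shape
    inv_depth = np.zeros((H, W))
    for y0 in range(0, H, cell):
        for x0 in range(0, W, cell):
            in_cell = ((sparse_uv[:, 0] >= x0) & (sparse_uv[:, 0] < x0 + cell) &
                       (sparse_uv[:, 1] >= y0) & (sparse_uv[:, 1] < y0 + cell))
            pts, d = sparse_uv[in_cell], sparse_d[in_cell]
            if len(d) < 3:  # not enough support in this cell: fall back to the global mean
                inv_depth[y0:y0 + cell, x0:x0 + cell] = 1.0 / np.mean(sparse_d)
                continue
            A = np.c_[pts, np.ones(len(d))]              # plane in inverse depth: a*u + b*v + c
            coeff, *_ = np.linalg.lstsq(A, 1.0 / d, rcond=None)
            xs, ys = np.meshgrid(np.arange(x0, min(x0 + cell, W)),
                                 np.arange(y0, min(y0 + cell, H)))
            inv_depth[ys, xs] = coeff[0] * xs + coeff[1] * ys + coeff[2]
    return 1.0 / np.clip(inv_depth, 1e-6, None)

def warp_depth_with_flow(depth_ref, flow):
    """Carry each reference-frame depth to its flow-displaced pixel in the next frame."""
    H, W = depth_ref.shape
    depth_next = np.full((H, W), np.nan)                 # unfilled pixels would need in-painting
    ys, xs = np.mgrid[0:H, 0:W]
    xt = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    yt = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    depth_next[yt, xt] = depth_ref[ys, xs]
    return depth_next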
Jumping Manifolds: Geometry Aware Dense Non-Rigid Structure from Motion
Given dense image feature correspondences of a non-rigidly moving object
across multiple frames, this paper proposes an algorithm to estimate its 3D
shape for each frame. To solve this problem accurately, a recent state-of-the-art algorithm reduces the task to a set of local linear subspace reconstruction and clustering problems using a Grassmann manifold representation \cite{kumar2018scalable}. Unfortunately, that method overlooks some critical issues in the modeling of surface deformations, e.g., the dependence of a local surface deformation on its neighbors. Furthermore, its way of grouping high-dimensional data points inevitably introduces the drawbacks of categorizing samples on the high-dimensional Grassmann manifold \cite{huang2015projection, harandi2014manifold}. Hence, to deal with the limitations of \cite{kumar2018scalable}, we propose an algorithm that jointly exploits the benefit of the high-dimensional Grassmann manifold to perform reconstruction and of its equivalent lower-dimensional representation to infer suitable clusters. To accomplish this, we project each Grassmannian onto a lower-dimensional Grassmann manifold that preserves and respects the deformation of the structure w.r.t. its neighbors. These lower-dimensional Grassmann points then act as representatives for selecting the high-dimensional Grassmann samples used in each local reconstruction. In practice, our algorithm provides a geometrically efficient way to solve dense NRSfM by switching between manifolds based on their benefit and usage.
Experimental results show that the proposed algorithm is very effective in handling noise, with reconstruction accuracy as good as or better than competing methods.
Comment: New version with corrected typos. 10 pages, 7 figures, 1 table. Accepted for publication in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019. Acknowledgement added. Supplementary material is available at https://suryanshkumar.github.io
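As a rough, hypothetical illustration of the two-manifold idea above (not the paper's exact projection or clustering scheme), the sketch below represents each local subspace by an orthonormal basis (a Grassmann point), maps it through a generic orthonormal matrix W to a lower-dimensional Grassmann manifold, and clusters the low-dimensional points with the projection metric; the resulting labels would then select the high-dimensional samples used for each local reconstruction. All names are illustrative assumptions.

import numpy as np
from scipy.cluster.vq import kmeans2

# Hypothetical sketch: cluster on a lower-dimensional Grassmann manifold, then
# use the labels to group the original high-dimensional Grassmann points.
# W is a generic orthonormal mapping, not the neighbor-preserving projection
# learned in the paper.

def to_grassmann(traj_block, p=2):
    """Orthonormal basis (Grassmann point) spanning a local trajectory block."""
    U, _, _ = np.linalg.svd(traj_block, full_matrices=False)
    return U[:, :p]                                    # D x p, orthonormal columns

def project_down(X, W):
    """Map a D x p Grassmann point through W (D x d) and re-orthonormalize,
    so the result stays a valid point on the d-dimensional Grassmann manifold."""
    Q, _ = np.linalg.qr(W.T @ X)
    return Q

def projection_distance(X, Y):
    """Projection metric between two Grassmann points."""
    return np.linalg.norm(X @ X.T - Y @ Y.T, ord='fro') / np.sqrt(2)

def cluster_grassmann(points_low, k):
    """Crude clustering in the low dimension: k-means on flattened projection matrices."""
    feats = np.stack([(X @ X.T).ravel() for X in points_low])
    _, labels = kmeans2(feats, k, minit='++', seed=0)
    return labels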
Non-Rigid Structure from Motion: Prior-Free Factorization Method Revisited
The simple prior-free factorization algorithm of Dai {\it{et al.}} \cite{dai2014simple} is a frequently cited work in the field of Non-Rigid Structure from Motion (NRSfM). Its appeal lies in its simplicity of implementation, its strong theoretical justification of the motion and structure estimation, and its originality. Despite this, the prevailing view is that it performs considerably worse than other methods on several benchmark datasets \cite{jensen2018benchmark,akhter2009nonrigid}. However, a careful investigation yields empirical statistics that argue against this view. With some elementary changes to the core algorithmic idea, the results we obtained surpass the results originally reported by Dai {\it{et al.}} \cite{dai2014simple} on the benchmark datasets by a significant margin. These results not only expose some unexplored areas of research in NRSfM but also give rise to new mathematical challenges for NRSfM researchers. We argue that by \textbf{properly} utilizing the well-established assumptions about a non-rigidly deforming shape, i.e., that it deforms smoothly over frames \cite{rabaud2008re} and that it spans a low-rank space, the simple prior-free idea can provide results comparable to the best available algorithms. In this paper, we explore some of the hidden intricacies missed by the work of Dai {\it{et al.}} \cite{dai2014simple} and show how some elementary measures and modifications can enhance its performance by as much as approximately 18\% on the benchmark datasets. The improved performance is justified and empirically verified by extensive experiments on several datasets. We believe our work has both practical and theoretical importance for the development of better NRSfM algorithms.
Comment: Accepted for publication in IEEE WACV 2020.
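For illustration only, the hypothetical sketch below shows how the two assumptions named above, smooth deformation over frames and a low-rank shape space, can be imposed on a 3F x P shape matrix produced by a prior-free factorization; it is a toy alternation, not the specific modifications evaluated in the paper.

import numpy as np

# Hypothetical sketch: alternately impose temporal smoothness and a rank-3K
# shape space on a shape matrix S (3F x P, three consecutive rows per frame).
# This only illustrates the assumptions discussed in the abstract.

def smooth_over_frames(S, weight=0.5):
    """Average each frame's 3 x P shape with its temporal neighbors."""
    F = S.shape[0] // 3
    S = S.reshape(F, 3, -1)
    S_smooth = S.copy()
    S_smooth[1:-1] = (1 - weight) * S[1:-1] + weight * 0.5 * (S[:-2] + S[2:])
    return S_smooth.reshape(3 * F, -1)

def project_to_rank_3K(S, K):
    """Truncate the shape matrix to rank 3K (low-rank shape-space assumption)."""
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    r = 3 * K
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r]

def refine_shape(S, K, iters=10):
    """Alternate the two priors a few times."""
    for _ in range(iters):
        S = project_to_rank_3K(smooth_over_frames(S), K)
    return S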
Multi-body Non-rigid Structure-from-Motion
Conventional structure-from-motion (SFM) research is primarily concerned with
the 3D reconstruction of a single, rigidly moving object seen by a static
camera, or a static and rigid scene observed by a moving camera; in both cases only one relative rigid motion is involved. Recent progress has extended SFM to the areas of {multi-body SFM} (where there are {multiple rigid} relative motions in the scene), as well as {non-rigid SFM} (where there is a single non-rigid, deformable object or scene). Along this line of thinking, there is an apparent gap: "multi-body non-rigid SFM", in which the
task would be to jointly reconstruct and segment multiple 3D structures of the
multiple, non-rigid objects or deformable scenes from images. Such a multi-body
non-rigid scenario is common in reality (e.g., two persons shaking hands, or a multi-person social event), and how to solve it represents a natural
{next-step} in SFM research. By leveraging recent results of subspace
clustering, this paper proposes, for the first time, an effective framework for
multi-body NRSFM, which simultaneously reconstructs and segments each 3D
trajectory into its respective low-dimensional subspace. Under our formulation, the 3D trajectories of each non-rigid structure can be well
approximated with a sparse affine combination of other 3D trajectories from the
same structure (self-expressiveness). We solve the resultant optimization with
the alternating direction method of multipliers (ADMM). We demonstrate the
efficacy of the proposed framework through extensive experiments on both
synthetic and real data sequences. Our method clearly outperforms alternative methods, such as first clustering the 2D feature tracks into groups and then performing non-rigid reconstruction in each group, or first conducting 3D reconstruction under a single-subspace assumption and then clustering the 3D trajectories into groups.
Comment: 21 pages, 16 figures.
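The self-expressiveness plus ADMM formulation described above can be sketched in a few lines of Python. The sketch follows the standard sparse-subspace-clustering recipe (each trajectory, a column of X, is written as a sparse combination of the others, and the symmetrized coefficients feed a spectral segmentation); parameter values and helper names are illustrative, not the paper's exact settings.

import numpy as np
from scipy.cluster.vq import kmeans2

def self_expressive_admm(X, lam=10.0, rho=1.0, iters=200):
    """Solve  min_C  lam/2 ||X - X C||_F^2 + ||C||_1  s.t. diag(C) = 0
    with ADMM (auxiliary variable A, multiplier Delta)."""
    n = X.shape[1]
    XtX = X.T @ X
    inv = np.linalg.inv(lam * XtX + rho * np.eye(n))
    C = np.zeros((n, n))
    Delta = np.zeros((n, n))
    for _ in range(iters):
        A = inv @ (lam * XtX + rho * C - Delta)                    # quadratic subproblem
        T = A + Delta / rho
        C = np.maximum(np.abs(T) - 1.0 / rho, 0.0) * np.sign(T)    # soft threshold
        np.fill_diagonal(C, 0.0)                                   # no self-representation
        Delta += rho * (A - C)
    return C

def segment_trajectories(C, k):
    """Spectral clustering on the symmetrized sparse coefficients |C| + |C|^T."""
    W = np.abs(C) + np.abs(C).T
    d = W.sum(axis=1)
    d[d == 0] = 1e-12
    L = np.eye(len(d)) - (W / np.sqrt(d)[:, None]) / np.sqrt(d)[None, :]
    _, vecs = np.linalg.eigh(L)
    emb = vecs[:, :k]                                              # eigenvectors of the smallest eigenvalues
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    _, labels = kmeans2(emb, k, minit='++', seed=0)
    return labels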
Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective
This paper addresses the task of dense non-rigid structure-from-motion
(NRSfM) using multiple images. State-of-the-art methods for this problem are often hindered by scalability issues, expensive computations, and noisy measurements. Further, recent methods for NRSfM usually either assume a small number of sparse
feature points or ignore local non-linearities of shape deformations, and thus
cannot reliably model complex non-rigid deformations. To address these issues,
in this paper, we propose a new approach for dense NRSfM by modeling the
problem on a Grassmann manifold. Specifically, we assume the complex non-rigid
deformations lie on a union of local linear subspaces both spatially and
temporally. This naturally allows for a compact representation of the complex
non-rigid deformation over frames. We provide experimental results on several
synthetic and real benchmark datasets. The procured results clearly demonstrate
that our method, apart from being scalable and more accurate than
state-of-the-art methods, is also more robust to noise and generalizes to
highly non-linear deformations.
Comment: 10 pages, 7 figures, 4 tables. Accepted for publication in the Conference on Computer Vision and Pattern Recognition (CVPR) 2018. Typos fixed and acknowledgement added.
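A minimal, hypothetical sketch of the union-of-local-linear-subspaces representation referred to above: the dense 2F x P measurement matrix is split into spatial patches (temporal windows could be handled analogously), and each patch is summarized by a small orthonormal basis, i.e. a point on a Grassmann manifold. Patch size and subspace dimension are illustrative, not the paper's settings.

import numpy as np

def local_subspaces(W, patch=500, p=3):
    """One Grassmann point (2F x p orthonormal basis) per spatial patch of trajectories."""
    bases = []
    for start in range(0, W.shape[1], patch):
        block = W[:, start:start + patch]
        U, _, _ = np.linalg.svd(block, full_matrices=False)
        bases.append(U[:, :p])
    return bases

def reconstruct_from_subspaces(W, bases, patch=500):
    """Project each patch back onto its local subspace: a compact approximation of W."""
    W_hat = np.empty_like(W)
    for i, start in enumerate(range(0, W.shape[1], patch)):
        U = bases[i]
        W_hat[:, start:start + patch] = U @ (U.T @ W[:, start:start + patch])
    return W_hat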
Organic Priors in Non-Rigid Structure from Motion
This paper advocates the use of organic priors in classical non-rigid
structure from motion (NRSfM). By organic priors, we mean invaluable
intermediate prior information intrinsic to the NRSfM matrix factorization
theory. It is shown that such priors reside in the factorized matrices, and
quite surprisingly, existing methods generally disregard them. The paper's main
contribution is to put forward a simple, methodical, and practical method that
can effectively exploit such organic priors to solve NRSfM. The proposed method
does not make assumptions other than the popular one on the low-rank shape and
offers a reliable solution to NRSfM under orthographic projection. Our work
reveals that the accessibility of organic priors is independent of the camera
motion and shape deformation type. Besides that, the paper provides insights
into the NRSfM factorization -- both in terms of shape and motion -- and is the
first approach to show the benefit of single rotation averaging for NRSfM.
Furthermore, we outline how to effectively recover motion and non-rigid 3D
shape using the proposed organic prior based approach and demonstrate results
that outperform prior-free NRSfM performance by a significant margin. Finally,
we present the benefits of our method via extensive experiments and evaluations
on several benchmark datasets.
Comment: To appear in the ECCV 2022 Conference (Oral Presentation). Draft info: 18 pages, 4 figures, and 6 tables. Project webpage: https://suryanshkumar.github.io/Organic_Prior_NRSfM
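As a pointer to the single rotation averaging ingredient mentioned above, the hypothetical sketch below fuses several rotation estimates of the same frame into one rotation using the standard chordal L2 mean; the paper's exact averaging scheme may differ.

import numpy as np

def single_rotation_average(rotations):
    """Chordal L2 mean of a list of 3x3 rotation matrices, projected back to SO(3)."""
    M = np.sum(rotations, axis=0)                      # sum of the rotation estimates
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])     # keep a proper rotation (det = +1)
    return U @ D @ Vt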
Robustifying the Multi-Scale Representation of Neural Radiance Fields
Neural Radiance Fields (NeRF) recently emerged as a new paradigm for object
representation from multi-view (MV) images. Yet, it cannot handle multi-scale (MS) images and camera-pose estimation errors, which is generally the case with multi-view images captured with a day-to-day commodity camera. Although the recently proposed Mip-NeRF can handle multi-scale imaging problems within NeRF, it cannot handle camera-pose estimation error. On the other hand, the newly proposed BARF can solve the camera-pose problem for NeRF but fails if the images are multi-scale in nature. This paper presents a robust multi-scale
neural radiance fields representation approach to simultaneously overcome both
real-world imaging issues. Our method handles multi-scale imaging effects and
camera-pose estimation problems with NeRF-inspired approaches by leveraging the
fundamentals of scene rigidity. To reduce unpleasant aliasing artifacts due to
multi-scale images in the ray space, we leverage Mip-NeRF multi-scale
representation. For joint estimation of robust camera pose, we propose
graph-neural network-based multiple motion averaging in the neural volume
rendering framework. We demonstrate, with examples, that for an accurate neural
representation of an object from day-to-day acquired multi-view images, it is
crucial to have precise camera-pose estimates. Without considering robustness
measures in the camera pose estimation, modeling for multi-scale aliasing
artifacts via conical frustum can be counterproductive. We present extensive
experiments on the benchmark datasets to demonstrate that our approach provides
better results than the recent NeRF-inspired approaches for such realistic
settings.
Comment: Accepted for publication at the British Machine Vision Conference (BMVC) 2022. Draft info: 13 pages, 3 figures, and 4 tables.
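For reference, the Mip-NeRF ingredient leveraged above is the integrated positional encoding (IPE) of a conical-frustum sample: the expected sinusoidal features of a Gaussian with mean mu and diagonal covariance, which attenuates high frequencies at coarse scales and thereby reduces aliasing for multi-scale inputs. The sketch below is a plain-NumPy rendition of that published formula, not the code of this paper.

import numpy as np

def integrated_positional_encoding(mu, diag_sigma, num_freqs=16):
    """IPE features: E[sin(2^l x)] and E[cos(2^l x)] for x ~ N(mu, diag_sigma).
    mu and diag_sigma have shape (..., 3); the output has shape (..., 6 * num_freqs)."""
    feats = []
    for l in range(num_freqs):
        scale = 2.0 ** l
        atten = np.exp(-0.5 * (scale ** 2) * diag_sigma)   # E[sin(a x)] = sin(a mu) * exp(-a^2 sigma^2 / 2)
        feats.append(np.sin(scale * mu) * atten)
        feats.append(np.cos(scale * mu) * atten)
    return np.concatenate(feats, axis=-1)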
Quantum Annealing for Single Image Super-Resolution
This paper proposes a quantum computing-based algorithm to solve the single
image super-resolution (SISR) problem. One of the well-known classical
approaches for SISR relies on the well-established patch-wise sparse modeling
of the problem. Yet, this field's current state of affairs is that deep neural
networks (DNNs) have demonstrated results far superior to those of traditional approaches. Nevertheless, quantum computing is expected to become increasingly
prominent for machine learning problems soon. As a result, in this work, we take the opportunity to perform an early exploration of applying a quantum computing algorithm to this important image enhancement problem, i.e., SISR.
Among the two paradigms of quantum computing, namely universal gate quantum
computing and adiabatic quantum computing (AQC), the latter has been
successfully applied to practical computer vision problems, in which quantum
parallelism has been exploited to solve combinatorial optimization efficiently.
This work formulates quantum SISR as a sparse-coding optimization problem, which is solved using quantum annealers accessed via the D-Wave Leap
platform. The proposed AQC-based algorithm is demonstrated to achieve a speed-up over its classical analog while maintaining comparable SISR accuracy.
Comment: Accepted to the IEEE/CVF CVPR 2023 NTIRE Challenge and Workshop. Draft info: 10 pages, 6 figures, 2 tables.
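To illustrate the sparse-coding-as-annealing idea above, the hypothetical sketch below assumes a binary code b over a small patch dictionary D (a simplification of real-valued sparse coding); the objective ||y - D b||^2 + lam * sum(b) then becomes a QUBO b^T Q b that an annealer, e.g. one reached through D-Wave Leap, could minimize. The brute-force solver is only for checking tiny instances; all names and values are illustrative.

import itertools
import numpy as np

def sparse_coding_qubo(D, y, lam=0.1):
    """Q such that b^T Q b = ||y - D b||^2 + lam * sum(b) + const for binary b."""
    G = D.T @ D
    Q = G.copy()
    np.fill_diagonal(Q, np.diag(G) - 2.0 * (D.T @ y) + lam)   # linear terms absorbed (b_i^2 = b_i)
    return Q

def brute_force_qubo(Q):
    """Exhaustive minimizer, feasible only for very small codes; a quantum
    annealer would take the same Q as input."""
    n = Q.shape[0]
    best_b, best_e = None, np.inf
    for bits in itertools.product([0, 1], repeat=n):
        b = np.array(bits, dtype=float)
        e = float(b @ Q @ b)
        if e < best_e:
            best_b, best_e = b, e
    return best_b, best_e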